The Sparks Foundation - GRIP - Data Science and Business Analytics Intern - October 2021 batch

Author - Anand Bhausaheb Kharabe

Task 3: Exploratory Data Analysis - Retail

Level : Beginner

Language-Python

Software-Jupyter Notebook

Aim: To perform 'Exploratory Data Analysis' on the dataset 'SampleSuperstore'

As a business manager, try to find out the weak areas where you can work to make more profit.

The dataset can be downloaded from this link: https://bit.ly/3i4rbWl

Importing Required Libraries

In [1]:
import numpy as np
import pandas as pd
import sklearn.metrics as sm
import seaborn as sns
import matplotlib.pyplot as plt
import klib
%matplotlib inline
import plotly.graph_objects as go
import plotly.express as px
from plotly.offline import iplot, init_notebook_mode
init_notebook_mode(connected = True)

import warnings
warnings.filterwarnings("ignore")

Reading the Dataset using read_csv()

In [2]:
t3 = pd.read_csv("SampleSuperstore.csv")

Exploratory Data Analysis (EDA) on the DataFrame t3

In [3]:
t3.head() # Shows the first five rows of the data from variable t3
Out[3]:
Ship Mode Segment Country City State Postal Code Region Category Sub-Category Sales Quantity Discount Profit
0 Second Class Consumer United States Henderson Kentucky 42420 South Furniture Bookcases 261.9600 2 0.00 41.9136
1 Second Class Consumer United States Henderson Kentucky 42420 South Furniture Chairs 731.9400 3 0.00 219.5820
2 Second Class Corporate United States Los Angeles California 90036 West Office Supplies Labels 14.6200 2 0.00 6.8714
3 Standard Class Consumer United States Fort Lauderdale Florida 33311 South Furniture Tables 957.5775 5 0.45 -383.0310
4 Standard Class Consumer United States Fort Lauderdale Florida 33311 South Office Supplies Storage 22.3680 2 0.20 2.5164
In [4]:
t3.tail() # Shows the last five rows of the data from variable t3
Out[4]:
Ship Mode Segment Country City State Postal Code Region Category Sub-Category Sales Quantity Discount Profit
9989 Second Class Consumer United States Miami Florida 33180 South Furniture Furnishings 25.248 3 0.2 4.1028
9990 Standard Class Consumer United States Costa Mesa California 92627 West Furniture Furnishings 91.960 2 0.0 15.6332
9991 Standard Class Consumer United States Costa Mesa California 92627 West Technology Phones 258.576 2 0.2 19.3932
9992 Standard Class Consumer United States Costa Mesa California 92627 West Office Supplies Paper 29.600 4 0.0 13.3200
9993 Second Class Consumer United States Westminster California 92683 West Office Supplies Appliances 243.160 2 0.0 72.9480
In [5]:
t3.shape # shows the shape of the data variable in tuple format
Out[5]:
(9994, 13)
In [6]:
t3.info() # Print the summary of the dataframe
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9994 entries, 0 to 9993
Data columns (total 13 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Ship Mode     9994 non-null   object 
 1   Segment       9994 non-null   object 
 2   Country       9994 non-null   object 
 3   City          9994 non-null   object 
 4   State         9994 non-null   object 
 5   Postal Code   9994 non-null   int64  
 6   Region        9994 non-null   object 
 7   Category      9994 non-null   object 
 8   Sub-Category  9994 non-null   object 
 9   Sales         9994 non-null   float64
 10  Quantity      9994 non-null   int64  
 11  Discount      9994 non-null   float64
 12  Profit        9994 non-null   float64
dtypes: float64(3), int64(2), object(8)
memory usage: 1015.1+ KB
In [7]:
t3.describe() 
# shows the Statistical details 
Out[7]:
        Postal Code         Sales     Quantity     Discount       Profit
count   9994.000000   9994.000000  9994.000000  9994.000000  9994.000000
mean   55190.379428    229.858001     3.789574     0.156203    28.656896
std    32063.693350    623.245101     2.225110     0.206452   234.260108
min     1040.000000      0.444000     1.000000     0.000000 -6599.978000
25%    23223.000000     17.280000     2.000000     0.000000     1.728750
50%    56430.500000     54.490000     3.000000     0.200000     8.666500
75%    90008.000000    209.940000     5.000000     0.200000    29.364000
max    99301.000000  22638.480000    14.000000     0.800000  8399.976000
In [8]:
t3.columns # Displays the column names of the data
Out[8]:
Index(['Ship Mode', 'Segment', 'Country', 'City', 'State', 'Postal Code',
       'Region', 'Category', 'Sub-Category', 'Sales', 'Quantity', 'Discount',
       'Profit'],
      dtype='object')
In [9]:
t3['Ship Mode'].unique() # Gives the unique values in the column Ship Mode
Out[9]:
array(['Second Class', 'Standard Class', 'First Class', 'Same Day'],
      dtype=object)
In [10]:
t3['Segment'].unique() # Gives the unique values in the column Segment
Out[10]:
array(['Consumer', 'Corporate', 'Home Office'], dtype=object)
In [11]:
t3['City'].unique() # Gives the unique values in the column City
Out[11]:
array(['Henderson', 'Los Angeles', 'Fort Lauderdale', 'Concord',
       'Seattle', 'Fort Worth', 'Madison', 'West Jordan', 'San Francisco',
       'Fremont', 'Philadelphia', 'Orem', 'Houston', 'Richardson',
       'Naperville', 'Melbourne', 'Eagan', 'Westland', 'Dover',
       'New Albany', 'New York City', 'Troy', 'Chicago', 'Gilbert',
       'Springfield', 'Jackson', 'Memphis', 'Decatur', 'Durham',
       'Columbia', 'Rochester', 'Minneapolis', 'Portland', 'Saint Paul',
       'Aurora', 'Charlotte', 'Orland Park', 'Urbandale', 'Columbus',
       'Bristol', 'Wilmington', 'Bloomington', 'Phoenix', 'Roseville',
       'Independence', 'Pasadena', 'Newark', 'Franklin', 'Scottsdale',
       'San Jose', 'Edmond', 'Carlsbad', 'San Antonio', 'Monroe',
       'Fairfield', 'Grand Prairie', 'Redlands', 'Hamilton', 'Westfield',
       'Akron', 'Denver', 'Dallas', 'Whittier', 'Saginaw', 'Medina',
       'Dublin', 'Detroit', 'Tampa', 'Santa Clara', 'Lakeville',
       'San Diego', 'Brentwood', 'Chapel Hill', 'Morristown',
       'Cincinnati', 'Inglewood', 'Tamarac', 'Colorado Springs',
       'Belleville', 'Taylor', 'Lakewood', 'Arlington', 'Arvada',
       'Hackensack', 'Saint Petersburg', 'Long Beach', 'Hesperia',
       'Murfreesboro', 'Layton', 'Austin', 'Lowell', 'Manchester',
       'Harlingen', 'Tucson', 'Quincy', 'Pembroke Pines', 'Des Moines',
       'Peoria', 'Las Vegas', 'Warwick', 'Miami', 'Huntington Beach',
       'Richmond', 'Louisville', 'Lawrence', 'Canton', 'New Rochelle',
       'Gastonia', 'Jacksonville', 'Auburn', 'Norman', 'Park Ridge',
       'Amarillo', 'Lindenhurst', 'Huntsville', 'Fayetteville',
       'Costa Mesa', 'Parker', 'Atlanta', 'Gladstone', 'Great Falls',
       'Lakeland', 'Montgomery', 'Mesa', 'Green Bay', 'Anaheim',
       'Marysville', 'Salem', 'Laredo', 'Grove City', 'Dearborn',
       'Warner Robins', 'Vallejo', 'Mission Viejo', 'Rochester Hills',
       'Plainfield', 'Sierra Vista', 'Vancouver', 'Cleveland', 'Tyler',
       'Burlington', 'Waynesboro', 'Chester', 'Cary', 'Palm Coast',
       'Mount Vernon', 'Hialeah', 'Oceanside', 'Evanston', 'Trenton',
       'Cottage Grove', 'Bossier City', 'Lancaster', 'Asheville',
       'Lake Elsinore', 'Omaha', 'Edmonds', 'Santa Ana', 'Milwaukee',
       'Florence', 'Lorain', 'Linden', 'Salinas', 'New Brunswick',
       'Garland', 'Norwich', 'Alexandria', 'Toledo', 'Farmington',
       'Riverside', 'Torrance', 'Round Rock', 'Boca Raton',
       'Virginia Beach', 'Murrieta', 'Olympia', 'Washington',
       'Jefferson City', 'Saint Peters', 'Rockford', 'Brownsville',
       'Yonkers', 'Oakland', 'Clinton', 'Encinitas', 'Roswell',
       'Jonesboro', 'Antioch', 'Homestead', 'La Porte', 'Lansing',
       'Cuyahoga Falls', 'Reno', 'Harrisonburg', 'Escondido', 'Royal Oak',
       'Rockville', 'Coral Springs', 'Buffalo', 'Boynton Beach',
       'Gulfport', 'Fresno', 'Greenville', 'Macon', 'Cedar Rapids',
       'Providence', 'Pueblo', 'Deltona', 'Murray', 'Middletown',
       'Freeport', 'Pico Rivera', 'Provo', 'Pleasant Grove', 'Smyrna',
       'Parma', 'Mobile', 'New Bedford', 'Irving', 'Vineland', 'Glendale',
       'Niagara Falls', 'Thomasville', 'Westminster', 'Coppell', 'Pomona',
       'North Las Vegas', 'Allentown', 'Tempe', 'Laguna Niguel',
       'Bridgeton', 'Everett', 'Watertown', 'Appleton', 'Bellevue',
       'Allen', 'El Paso', 'Grapevine', 'Carrollton', 'Kent', 'Lafayette',
       'Tigard', 'Skokie', 'Plano', 'Suffolk', 'Indianapolis', 'Bayonne',
       'Greensboro', 'Baltimore', 'Kenosha', 'Olathe', 'Tulsa', 'Redmond',
       'Raleigh', 'Muskogee', 'Meriden', 'Bowling Green', 'South Bend',
       'Spokane', 'Keller', 'Port Orange', 'Medford', 'Charlottesville',
       'Missoula', 'Apopka', 'Reading', 'Broomfield', 'Paterson',
       'Oklahoma City', 'Chesapeake', 'Lubbock', 'Johnson City',
       'San Bernardino', 'Leominster', 'Bozeman', 'Perth Amboy',
       'Ontario', 'Rancho Cucamonga', 'Moorhead', 'Mesquite', 'Stockton',
       'Ormond Beach', 'Sunnyvale', 'York', 'College Station',
       'Saint Louis', 'Manteca', 'San Angelo', 'Salt Lake City',
       'Knoxville', 'Little Rock', 'Lincoln Park', 'Marion', 'Littleton',
       'Bangor', 'Southaven', 'New Castle', 'Midland', 'Sioux Falls',
       'Fort Collins', 'Clarksville', 'Sacramento', 'Thousand Oaks',
       'Malden', 'Holyoke', 'Albuquerque', 'Sparks', 'Coachella',
       'Elmhurst', 'Passaic', 'North Charleston', 'Newport News',
       'Jamestown', 'Mishawaka', 'La Quinta', 'Tallahassee', 'Nashville',
       'Bellingham', 'Woodstock', 'Haltom City', 'Wheeling',
       'Summerville', 'Hot Springs', 'Englewood', 'Las Cruces', 'Hoover',
       'Frisco', 'Vacaville', 'Waukesha', 'Bakersfield', 'Pompano Beach',
       'Corpus Christi', 'Redondo Beach', 'Orlando', 'Orange',
       'Lake Charles', 'Highland Park', 'Hempstead', 'Noblesville',
       'Apple Valley', 'Mount Pleasant', 'Sterling Heights', 'Eau Claire',
       'Pharr', 'Billings', 'Gresham', 'Chattanooga', 'Meridian',
       'Bolingbrook', 'Maple Grove', 'Woodland', 'Missouri City',
       'Pearland', 'San Mateo', 'Grand Rapids', 'Visalia',
       'Overland Park', 'Temecula', 'Yucaipa', 'Revere', 'Conroe',
       'Tinley Park', 'Dubuque', 'Dearborn Heights', 'Santa Fe',
       'Hickory', 'Carol Stream', 'Saint Cloud', 'North Miami',
       'Plantation', 'Port Saint Lucie', 'Rock Hill', 'Odessa',
       'West Allis', 'Chula Vista', 'Manhattan', 'Altoona', 'Thornton',
       'Champaign', 'Texarkana', 'Edinburg', 'Baytown', 'Greenwood',
       'Woonsocket', 'Superior', 'Bedford', 'Covington', 'Broken Arrow',
       'Miramar', 'Hollywood', 'Deer Park', 'Wichita', 'Mcallen',
       'Iowa City', 'Boise', 'Cranston', 'Port Arthur', 'Citrus Heights',
       'The Colony', 'Daytona Beach', 'Bullhead City', 'Portage', 'Fargo',
       'Elkhart', 'San Gabriel', 'Margate', 'Sandy Springs', 'Mentor',
       'Lawton', 'Hampton', 'Rome', 'La Crosse', 'Lewiston',
       'Hattiesburg', 'Danville', 'Logan', 'Waterbury', 'Athens',
       'Avondale', 'Marietta', 'Yuma', 'Wausau', 'Pasco', 'Oak Park',
       'Pensacola', 'League City', 'Gaithersburg', 'Lehi', 'Tuscaloosa',
       'Moreno Valley', 'Georgetown', 'Loveland', 'Chandler', 'Helena',
       'Kirkwood', 'Waco', 'Frankfort', 'Bethlehem', 'Grand Island',
       'Woodbury', 'Rogers', 'Clovis', 'Jupiter', 'Santa Barbara',
       'Cedar Hill', 'Norfolk', 'Draper', 'Ann Arbor', 'La Mesa',
       'Pocatello', 'Holland', 'Milford', 'Buffalo Grove', 'Lake Forest',
       'Redding', 'Chico', 'Utica', 'Conway', 'Cheyenne', 'Owensboro',
       'Caldwell', 'Kenner', 'Nashua', 'Bartlett', 'Redwood City',
       'Lebanon', 'Santa Maria', 'Des Plaines', 'Longview',
       'Hendersonville', 'Waterloo', 'Cambridge', 'Palatine', 'Beverly',
       'Eugene', 'Oxnard', 'Renton', 'Glenview', 'Delray Beach',
       'Commerce City', 'Texas City', 'Wilson', 'Rio Rancho', 'Goldsboro',
       'Montebello', 'El Cajon', 'Beaumont', 'West Palm Beach', 'Abilene',
       'Normal', 'Saint Charles', 'Camarillo', 'Hillsboro', 'Burbank',
       'Modesto', 'Garden City', 'Atlantic City', 'Longmont', 'Davis',
       'Morgan Hill', 'Clifton', 'Sheboygan', 'East Point', 'Rapid City',
       'Andover', 'Kissimmee', 'Shelton', 'Danbury', 'Sanford',
       'San Marcos', 'Greeley', 'Mansfield', 'Elyria', 'Twin Falls',
       'Coral Gables', 'Romeoville', 'Marlborough', 'Laurel', 'Bryan',
       'Pine Bluff', 'Aberdeen', 'Hagerstown', 'East Orange',
       'Arlington Heights', 'Oswego', 'Coon Rapids', 'San Clemente',
       'San Luis Obispo', 'Springdale', 'Lodi', 'Mason'], dtype=object)
In [12]:
t3['State'].unique() # Gives the unique values in the column State
Out[12]:
array(['Kentucky', 'California', 'Florida', 'North Carolina',
       'Washington', 'Texas', 'Wisconsin', 'Utah', 'Nebraska',
       'Pennsylvania', 'Illinois', 'Minnesota', 'Michigan', 'Delaware',
       'Indiana', 'New York', 'Arizona', 'Virginia', 'Tennessee',
       'Alabama', 'South Carolina', 'Oregon', 'Colorado', 'Iowa', 'Ohio',
       'Missouri', 'Oklahoma', 'New Mexico', 'Louisiana', 'Connecticut',
       'New Jersey', 'Massachusetts', 'Georgia', 'Nevada', 'Rhode Island',
       'Mississippi', 'Arkansas', 'Montana', 'New Hampshire', 'Maryland',
       'District of Columbia', 'Kansas', 'Vermont', 'Maine',
       'South Dakota', 'Idaho', 'North Dakota', 'Wyoming',
       'West Virginia'], dtype=object)
In [13]:
t3['Region'].unique() # Gives the unique values in the column Region
Out[13]:
array(['South', 'West', 'Central', 'East'], dtype=object)
In [14]:
t3['Category'].unique() # Gives the unique values in the column Category
Out[14]:
array(['Furniture', 'Office Supplies', 'Technology'], dtype=object)
In [15]:
t3['Sub-Category'].unique() # Gives the unique values in the column Sub-Category
Out[15]:
array(['Bookcases', 'Chairs', 'Labels', 'Tables', 'Storage',
       'Furnishings', 'Art', 'Phones', 'Binders', 'Appliances', 'Paper',
       'Accessories', 'Envelopes', 'Fasteners', 'Supplies', 'Machines',
       'Copiers'], dtype=object)
In [16]:
t3['Sales'].unique() # Gives the unique values in the column Sales
Out[16]:
array([261.96 , 731.94 ,  14.62 , ..., 437.472,  97.98 , 243.16 ])
In [17]:
t3['Quantity'].unique() # Gives the unique values in the column Quantity
Out[17]:
array([ 2,  3,  5,  7,  4,  6,  9,  1,  8, 14, 11, 13, 10, 12],
      dtype=int64)
In [18]:
t3['Discount'].unique() # Gives the unique values in the column Discount
Out[18]:
array([0.  , 0.45, 0.2 , 0.8 , 0.3 , 0.5 , 0.7 , 0.6 , 0.32, 0.1 , 0.4 ,
       0.15])
In [19]:
t3['Profit'].unique() # Gives the unique values in the column Profit
Out[19]:
array([ 41.9136, 219.582 ,   6.8714, ...,  16.124 ,   4.1028,  72.948 ])
In [20]:
t3.isna().sum() # Shows the number of NA values in each column
Out[20]:
Ship Mode       0
Segment         0
Country         0
City            0
State           0
Postal Code     0
Region          0
Category        0
Sub-Category    0
Sales           0
Quantity        0
Discount        0
Profit          0
dtype: int64
In [21]:
t3.isnull()
Out[21]:
Ship Mode Segment Country City State Postal Code Region Category Sub-Category Sales Quantity Discount Profit
0 False False False False False False False False False False False False False
1 False False False False False False False False False False False False False
2 False False False False False False False False False False False False False
3 False False False False False False False False False False False False False
4 False False False False False False False False False False False False False
... ... ... ... ... ... ... ... ... ... ... ... ... ...
9989 False False False False False False False False False False False False False
9990 False False False False False False False False False False False False False
9991 False False False False False False False False False False False False False
9992 False False False False False False False False False False False False False
9993 False False False False False False False False False False False False False

9994 rows × 13 columns

In [22]:
t3.isna().any()
Out[22]:
Ship Mode       False
Segment         False
Country         False
City            False
State           False
Postal Code     False
Region          False
Category        False
Sub-Category    False
Sales           False
Quantity        False
Discount        False
Profit          False
dtype: bool

Data Visualization using Plotly and Seaborn

In [23]:
t3.select_dtypes("number").corr() # Shows the pairwise correlation of the numeric columns (text columns are excluded explicitly)
Out[23]:
             Postal Code     Sales  Quantity  Discount    Profit
Postal Code     1.000000 -0.023854  0.012761  0.058443 -0.029961
Sales          -0.023854  1.000000  0.200795 -0.028190  0.479064
Quantity        0.012761  0.200795  1.000000  0.008623  0.066253
Discount        0.058443 -0.028190  0.008623  1.000000 -0.219487
Profit         -0.029961  0.479064  0.066253 -0.219487  1.000000
In [24]:
sns.heatmap(t3.select_dtypes("number").corr(), annot=True) # Heatmap of the numeric correlations
Out[24]:
<AxesSubplot:>
In [25]:
sns.pairplot(t3,hue='Region')
Out[25]:
<seaborn.axisgrid.PairGrid at 0x128ba59f880>
In [26]:
# Create histograms of all numeric columns; figure size and tick label
# sizes are passed to DataFrame.hist directly so they apply to every subplot
t3.hist(figsize=(40, 40), xlabelsize=12, ylabelsize=12)
Out[26]:
array([[<AxesSubplot:title={'center':'Postal Code'}>,
        <AxesSubplot:title={'center':'Sales'}>],
       [<AxesSubplot:title={'center':'Quantity'}>,
        <AxesSubplot:title={'center':'Discount'}>],
       [<AxesSubplot:title={'center':'Profit'}>, <AxesSubplot:>]],
      dtype=object)
In [27]:
plt.figure(figsize=(15,15))
sns.countplot(x=t3['State'])
plt.xticks(rotation=90)
plt.title("State")
plt.show()
In [28]:
plt.figure(2, figsize=(20,15))
sns.barplot(x=t3['Category'],
           y=t3['Profit'].values,
           data = t3)
plt.xticks(rotation= 70)
plt.title('Category/Profit')
plt.xlabel('Category')
plt.ylabel('Profit')
plt.show()
In [29]:
plt.figure(2, figsize=(20,15))
sns.barplot(x=t3['Sub-Category'],
           y=t3['Profit'].values,
           data = t3)
plt.xticks(rotation= 70)
plt.title('Sub-Category/Profit')
plt.xlabel('Sub-Category')
plt.ylabel('Profit')
plt.show()
In [30]:
plt.figure(2, figsize=(20,15))
sns.barplot(x=t3['Sub-Category'],
           y=t3['Discount'].values,
           data = t3)
plt.xticks(rotation= 70)
plt.title('Sub-Category/Discount')
plt.xlabel('Sub-Category')
plt.ylabel('Discount')
plt.show()
In [31]:
plt.figure(2, figsize=(20,15))
sns.barplot(x=t3['Sub-Category'],
           y=t3['Sales'].values,
           data = t3)
plt.xticks(rotation= 70)
plt.title('Sub-Category/Sales')
plt.xlabel('Sub-Category')
plt.ylabel('Sales')
plt.show()
In [32]:
fig = go.Figure(
    data=[go.Bar(x= t3['Sub-Category'],y= t3['Profit'])],
    layout_title_text= 'Sub-Category Wise Profit'
)
fig.show()
[Figure: bar chart "Sub-Category Wise Profit" (Profit for each Sub-Category)]
In [33]:
fig = go.Figure(
    data=[go.Bar(x= t3['Category'],y= t3['Profit'])],
    layout_title_text= 'Category Wise Profit'
)
fig.show()
[Figure: bar chart "Category Wise Profit" (Profit for Furniture, Office Supplies and Technology)]
In [34]:
fig = px.bar(t3, x="Sub-Category", y="Profit", color="Segment", title="Segment Wise Sub-Category/Profit")
fig.show()
[Figure: bar chart "Segment Wise Sub-Category/Profit"; x-axis: Sub-Category, y-axis: Profit, color: Segment (Consumer, Corporate, Home Office)]

Copiers show no negative profit.

Phones show a comparatively high profit margin across segments.

Fasteners have the lowest profit margin across segments.

Supplies and Tables deserve more attention.
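
The observations above are read off the chart. As a rough cross-check, the same comparison can be expressed as a table. The sketch below is not part of the original notebook: the helper name profit_summary and the "Margin" column are assumptions introduced here, and the sample frame holds only the five rows shown by t3.head() earlier, so its numbers do not represent the full dataset.

```python
import pandas as pd

def profit_summary(df: pd.DataFrame, by: str = "Sub-Category") -> pd.DataFrame:
    """Total sales, total profit and profit margin per group, lowest profit first."""
    out = df.groupby(by)[["Sales", "Profit"]].sum()
    # Margin = total profit divided by total sales of the group
    out["Margin"] = out["Profit"] / out["Sales"]
    return out.sort_values("Profit")

# The five rows displayed by t3.head() (only the columns needed here)
sample = pd.DataFrame({
    "Sub-Category": ["Bookcases", "Chairs", "Labels", "Tables", "Storage"],
    "Sales": [261.96, 731.94, 14.62, 957.5775, 22.368],
    "Profit": [41.9136, 219.582, 6.8714, -383.031, 2.5164],
})

summary = profit_summary(sample)
print(summary)
```

Calling profit_summary(t3) on the full DataFrame should list the loss-making sub-categories at the top, and profit_summary(t3, by="Segment") gives the same view per segment.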

In [35]:
fig = px.bar(t3, x="Sub-Category", y="Profit", color="Region", title="Region wise Sub-Category/Profit")
fig.show()
[Figure: bar chart "Region wise Sub-Category/Profit"; x-axis: Sub-Category, y-axis: Profit, color: Region (South, West, Central, East)]

For Copiers, the East region yields more profit than the Central, West and South regions.

For Phones, the East region is likewise more profitable than the Central, West and South regions.
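
A region-by-sub-category comparison like this one can also be checked numerically with a pivot table. This is a minimal sketch, not the notebook's own method: the helper name profit_by_region is an assumption, and the sample frame contains only the five rows shown by t3.head(), so it merely illustrates the shape of the result.

```python
import pandas as pd

def profit_by_region(df: pd.DataFrame) -> pd.DataFrame:
    """Total profit with one row per Sub-Category and one column per Region."""
    return df.pivot_table(index="Sub-Category", columns="Region",
                          values="Profit", aggfunc="sum")

# The five rows displayed by t3.head() (only the columns needed here)
sample = pd.DataFrame({
    "Region": ["South", "South", "West", "South", "South"],
    "Sub-Category": ["Bookcases", "Chairs", "Labels", "Tables", "Storage"],
    "Profit": [41.9136, 219.582, 6.8714, -383.031, 2.5164],
})

pivot = profit_by_region(sample)
# Region with the highest total profit for each sub-category;
# combinations that do not occur stay NaN and are skipped by idxmax
best_region = pivot.idxmax(axis=1)
print(pivot)
print(best_region)
```

On the full DataFrame, profit_by_region(t3).loc[["Copiers", "Phones"]] would show the four regional totals side by side for the two sub-categories discussed above.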

In [36]:
fig = px.bar(t3, x="Sub-Category", y="Discount", color="Region", title="Region wise Sub-Category/Discount")
fig.show()
[Figure: bar chart "Region wise Sub-Category/Discount"; x-axis: Sub-Category, y-axis: Discount, color: Region (South, West, Central, East)]
In [37]:
A = t3['Region'].value_counts()  # Number of rows per region, largest first
B = A.index.tolist()             # Labels taken from the counts so they always match the values

Region Count

In [38]:
trace = go.Pie(labels = B , values = A,)
data = [trace]
fig = go.Figure(data = data)
iplot(fig)
In [39]:
A1 = t3['Segment'].value_counts()  # Number of rows per segment, largest first
B1 = A1.index.tolist()             # Labels taken from the counts so they always match the values

Segment Count

In [40]:
trace = go.Pie(labels = B1 , values = A1,)
data = [trace]
fig = go.Figure(data = data)
iplot(fig)
In [41]:
# Boxplot
fig = px.box(t3, y="Profit",color="Region",title="Boxplot of Profit",template="none")
fig.show()
[Figure: box plot "Boxplot of Profit"; y-axis: Profit, color: Region (South, West, Central, East)]

South has a maximum profit of 3,177.475 and a largest loss of -3,839.99.

West has a maximum profit of 6,719.981 and a largest loss of -3,399.98.

Central has a maximum profit of 8,399.976 and a largest loss of -3,701.893.

East has a maximum profit of 5,039.986 and a largest loss of -6,599.978.
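
The extremes listed above were read from the box plot. They can also be computed directly with a groupby. The sketch below is an illustration only: the helper name profit_extremes is an assumption, and the sample frame holds just the ten rows shown by t3.head() and t3.tail(), so its values differ from the full-dataset figures quoted above.

```python
import pandas as pd

def profit_extremes(df: pd.DataFrame) -> pd.DataFrame:
    """Smallest and largest single-row profit per region."""
    return df.groupby("Region")["Profit"].agg(["min", "max"])

# The ten rows displayed by t3.head() and t3.tail() (only the columns needed here)
sample = pd.DataFrame({
    "Region": ["South", "South", "West", "South", "South",
               "South", "West", "West", "West", "West"],
    "Profit": [41.9136, 219.582, 6.8714, -383.031, 2.5164,
               4.1028, 15.6332, 19.3932, 13.32, 72.948],
})

extremes = profit_extremes(sample)
print(extremes)
```

Running profit_extremes(t3) on the full DataFrame should give one row per region with the minimum and maximum profit, which can be compared against the values read from the plot.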

In [42]:
# Scatter Plot
px.scatter(t3,x="Sales", y="Profit", color="Region",
           title="Sales vs Profit")
[Figure: scatter plot "Sales vs Profit"; x-axis: Sales, y-axis: Profit, color: Region (South, West, Central, East)]
In [43]:
# Scatter Plot  


px.scatter(t3,x="Discount", y="Profit", color="Region",
           title="Discount vs Profit")
[Figure: scatter plot "Discount vs Profit"; x-axis: Discount, y-axis: Profit, color: Region (South, West, Central, East)]
In [44]:
fig = px.bar(t3, x="Region", y="Profit", color="Segment", title="Segment wise Region/Profit")
fig.show()
[Figure: bar chart "Segment wise Region/Profit"; x-axis: Region, y-axis: Profit, color: Segment (Consumer, Corporate, Home Office)]
In [45]:
fig = px.bar(t3, x="Region", y="Sales", color="Segment", title="Segment wise Region/Sales")
fig.show()
[Figure: bar chart "Segment wise Region/Sales"; x-axis: Region, y-axis: Sales, color: Segment (Consumer, Corporate, Home Office)]

Conclusion

The discount rate on Tables should be reduced, because the charts show this sub-category is running at a loss. The same applies to Bookcases and Supplies.

The business manager should pay more attention to the Fasteners sub-category in the South region and consider offering some discount there to increase profit.

Sales in the South region are lower than in the other regions, so the business manager should focus on improving sales there.
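
The recommendation about discounts rests on the relationship between discount level and profit. One way to examine it is to group orders by discount and look at the mean profit and the share of loss-making rows. This is a minimal sketch under stated assumptions: the helper name discount_profile and its output column names are introduced here for illustration, and the sample frame contains only the ten rows shown by t3.head() and t3.tail(), far too few to support any conclusion on their own.

```python
import pandas as pd

def discount_profile(df: pd.DataFrame) -> pd.DataFrame:
    """Row count, mean profit and share of loss-making rows per discount level."""
    return df.groupby("Discount")["Profit"].agg(
        orders="size",
        mean_profit="mean",
        loss_share=lambda s: (s < 0).mean(),  # fraction of rows with negative profit
    )

# The ten rows displayed by t3.head() and t3.tail() (only the columns needed here)
sample = pd.DataFrame({
    "Discount": [0.0, 0.0, 0.0, 0.45, 0.2, 0.2, 0.0, 0.2, 0.0, 0.0],
    "Profit": [41.9136, 219.582, 6.8714, -383.031, 2.5164,
               4.1028, 15.6332, 19.3932, 13.32, 72.948],
})

profile = discount_profile(sample)
print(profile)
```

Applied to a single sub-category, for example discount_profile(t3[t3["Sub-Category"] == "Tables"]), the same table would indicate at which discount levels that sub-category starts to lose money.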